Adjust column names
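A minimal sketch of one way to tidy column names with pandas. The raw names below are hypothetical, since the dataset's original headers are not shown here.

```python
import pandas as pd

# Hypothetical raw headers (assumed for illustration only).
raw = pd.DataFrame(
    {"Customer ID ": [1, 2], " Age": [30, 45], "House Loan": ["yes", "no"]}
)

# Strip surrounding whitespace, lower-case, and replace spaces with underscores.
clean = raw.rename(columns=lambda c: c.strip().lower().replace(" ", "_"))

print(list(clean.columns))
```

Passing a function to `rename` applies the same rule to every header, which avoids maintaining a manual mapping.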

Data cleaning

Handle NaN values
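One possible way to inspect and fill missing values is sketched below. The columns and the fill strategy (median for numeric, most frequent value for categorical) are assumptions; the approach actually used in this analysis is not stated.

```python
import numpy as np
import pandas as pd

# Toy data with missing entries (illustrative values only).
df = pd.DataFrame(
    {
        "age": [30.0, np.nan, 45.0, 52.0],
        "job": ["admin", "technician", None, "admin"],
    }
)

# Count missing values per column before deciding how to treat them.
missing_per_column = df.isna().sum()
print(missing_per_column)

# Assumed strategy: median for numeric columns, mode for categorical ones.
filled = df.copy()
filled["age"] = filled["age"].fillna(filled["age"].median())
filled["job"] = filled["job"].fillna(filled["job"].mode()[0])
```

Dropping the affected rows with `dropna()` is the other common option when only a few rows are incomplete.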

Data visualization

Univariate Analysis

1) Histogram

Insight

1) Most customers with an account at the bank are between 25 and 40 years old.

2) Very few customers over 55 have an account.

Observation = Most customers' salaries fall in the ranges 10k-20k, 50k-70k, and 1 lakh-1.2 lakh.
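The counts behind a histogram can also be read off numerically by binning the column. The sketch below uses made-up ages and assumed bin edges, so the numbers are illustrative rather than the dataset's.

```python
import pandas as pd

# Made-up ages (not taken from the dataset).
ages = pd.Series([23, 27, 31, 34, 38, 39, 42, 47, 53, 58, 61], name="age")

# Assumed bin edges; intervals are right-inclusive, e.g. (25, 40].
bins = [18, 25, 40, 55, 100]
labels = ["18-25", "26-40", "41-55", "56+"]

age_groups = pd.cut(ages, bins=bins, labels=labels)
counts = age_groups.value_counts().sort_index()
print(counts)
```

The same binning applies to the salary column once suitable edges are chosen.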

2) Box plot

Observation = The salary box spans roughly 20k to 70k, there are no outliers, and the median salary is 60k.
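The quantities a box plot draws (quartiles, median, and the 1.5 × IQR whisker fences) can be checked directly. This is a sketch on invented salary values, assuming the usual 1.5 × IQR outlier rule.

```python
import pandas as pd

# Invented salary values (illustrative only).
salary = pd.Series([20000, 20000, 50000, 60000, 60000, 70000, 70000, 100000])

q1, median, q3 = salary.quantile([0.25, 0.5, 0.75])
iqr = q3 - q1

# Points beyond 1.5 * IQR from the quartiles are flagged as outliers.
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr
outliers = salary[(salary < lower_fence) | (salary > upper_fence)]

print(f"Q1={q1}, median={median}, Q3={q3}, outliers={len(outliers)}")
```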

Observation = Most customers save between 0 and 1,500 Rs of their salary, and some customers save a much larger amount.

Observation = Customers aged between 32 and 47 hold the most accounts at this bank.

3) Count plot

Approximately 7,000 of 35,000 customers have taken a loan from the bank.


Observation = 1) About 25,000 of the 45,211 customers have taken a house loan.

2) About 20,000 customers have not taken a house loan.

Observation = 1) 28k customers are married.

2) 13k customers are single.

3) 5k customers are divorced.

Only 8% of customers responded after the call.
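The figures read off a count plot can be confirmed with `value_counts`. The sketch below uses small made-up series; the category labels are assumed to be stored as `married`/`single`/`divorced` and `yes`/`no`.

```python
import pandas as pd

# Made-up categorical data (illustrative only).
marital = pd.Series(["married"] * 6 + ["single"] * 3 + ["divorced"] * 1)
response = pd.Series(["no"] * 23 + ["yes"] * 2)

# Absolute counts, as shown by the bars of a count plot.
marital_counts = marital.value_counts()

# Relative shares, useful for statements such as "8% responded".
marital_shares = marital.value_counts(normalize=True)
response_rate = (response == "yes").mean()

print(marital_counts)
print(f"response rate: {response_rate:.0%}")
```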

4) Scatter plot

The scatter plot shows that most people's savings are below 20,000, and that customers aged between 30 and 60 save more money.

5) Pair plot

6) Hexbin plot

7) Bar plot

Most divorced people have a salary above 60,000.

Most married people save their salary.

8) Distribution plot

Most customers earn a salary of about 20k, 60k, or 1 lakh.

Approximately 2,000 customers save up to 1,500 of their salary.

9) Heat map

Statistical manipulation

1. Find the correlation between the columns and draw the observations from it.

Observation = 1) Customer id is positively correlated with the previous day column. 2) Most columns are not correlated with each other; their correlations are close to 0.
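A correlation matrix of the numeric columns can be computed as sketched below. The column names and values are placeholders, and Pearson correlation (the pandas default) is assumed; values near 0 indicate no linear relationship, not necessarily no relationship at all.

```python
import pandas as pd

# Placeholder data; column names are assumptions.
df = pd.DataFrame(
    {
        "age": [25, 32, 47, 51, 38],
        "salary": [20000, 50000, 60000, 100000, 70000],
        "balance": [1500, 300, 800, 50, 1200],
        "marital": ["single", "married", "married", "divorced", "single"],
    }
)

# Keep only numeric columns, then compute pairwise Pearson correlations.
numeric = df.select_dtypes(include="number")
corr = numeric.corr()
print(corr.round(2))
```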

2. What is the mean age and duration time of the customers with respect to every column?

3. Find the mean and median of every column response wise and draw the observations.

Observation = In the customer balance and previous day columns there is a large difference between the mean and the median, which suggests these columns contain the most outliers.
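A response-wise mean and median can be obtained with a single `groupby`. The sketch below uses invented values and assumed column names; one large balance is included to show how an outlier pulls the mean away from the median.

```python
import pandas as pd

# Invented values; column names are assumptions.
df = pd.DataFrame(
    {
        "response": ["no", "no", "no", "no", "yes", "yes"],
        "balance": [100, 200, 300, 9400, 500, 700],
        "age": [30, 35, 40, 45, 50, 60],
    }
)

# Mean and median of each numeric column, per response group.
summary = df.groupby("response")[["balance", "age"]].agg(["mean", "median"])

# A large mean-median gap hints at skew or outliers in that column.
balance_gap = summary["balance"]["mean"] - summary["balance"]["median"]

print(summary)
print(balance_gap)
```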

4. Determine whether the columns follow a normal distribution; if they do not, try to transform them to be closer to normal.

Converting salary column to normal distribution

Age of customer column

Balance column

Converting to normal distribution

The column now looks approximately normal, with a bell-shaped curve.
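The transform used here is not stated, so the following is only a sketch of one common choice: a `log1p` transform, with skewness compared before and after. It assumes non-negative values; a column containing negative balances would need shifting first or a different transform.

```python
import numpy as np
import pandas as pd

# Invented right-skewed values (assumed non-negative).
balance = pd.Series(
    [50, 120, 300, 450, 800, 1500, 4000, 12000, 45000], dtype=float
)

skew_before = balance.skew()

# log1p(x) = log(1 + x); compresses the long right tail.
transformed = np.log1p(balance)
skew_after = transformed.skew()

print(f"skew before: {skew_before:.2f}, after: {skew_after:.2f}")
```

A skewness closer to 0 after the transform is consistent with a more symmetric, bell-like shape, though it is not a formal normality test.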

5. Find the best features using correlation and the chi-square test.

Chi-square test

Testing for a relationship between job and house loan
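To make the test concrete, the chi-square statistic can be computed by hand from a contingency table. The data below are invented and the column names assumed; in practice `scipy.stats.chi2_contingency` performs the same calculation and also returns a p-value.

```python
import numpy as np
import pandas as pd

# Invented data: 40 admin, 40 technician, 20 retired customers.
df = pd.DataFrame(
    {
        "job": ["admin"] * 40 + ["technician"] * 40 + ["retired"] * 20,
        "house_loan": (
            ["yes"] * 30 + ["no"] * 10    # admin
            + ["yes"] * 20 + ["no"] * 20  # technician
            + ["yes"] * 5 + ["no"] * 15   # retired
        ),
    }
)

# Observed counts for each (job, house_loan) combination.
observed = pd.crosstab(df["job"], df["house_loan"])
obs = observed.to_numpy()

# Expected counts under independence: row total * column total / n.
expected = np.outer(obs.sum(axis=1), obs.sum(axis=0)) / obs.sum()

chi2 = ((obs - expected) ** 2 / expected).sum()
dof = (obs.shape[0] - 1) * (obs.shape[1] - 1)

print(f"chi2={chi2:.4f}, dof={dof}")
```

With 2 degrees of freedom the 5% critical value is about 5.99, so a statistic above that would indicate that job and house loan are not independent in this toy sample.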

6. Find the probabilities of customer responses with respect to job role and education.

Marginal probability

Probability of education with respect to response to the call
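Marginal and conditional probabilities can both be derived from normalized counts. The sketch below uses invented data and assumed column names.

```python
import pandas as pd

# Invented data (illustrative only).
df = pd.DataFrame(
    {
        "education": ["primary"] * 4 + ["secondary"] * 10 + ["tertiary"] * 6,
        "response": (
            ["no"] * 4                  # primary
            + ["no"] * 9 + ["yes"] * 1  # secondary
            + ["no"] * 4 + ["yes"] * 2  # tertiary
        ),
    }
)

# Marginal probability of each education level.
marginal = df["education"].value_counts(normalize=True)

# P(response | education): each row sums to 1.
conditional = pd.crosstab(df["education"], df["response"], normalize="index")

print(marginal)
print(conditional.round(3))
```

Replacing `education` with a job column gives the corresponding table for job role.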

Let’s check if we have any statistical patterns in the Data frame (using plots or analysis).

Observation = The bar plot shows that the average salary of married customers is 57k, and that savings are higher for single and divorced people.